Abstract

In the multiple testing context, a challenging problem is the estimation of the proportion $\pi_0$ of true-null hypotheses. A large number of estimators of this quantity rely on identifiability assumptions that either appear to be violated on real data, or may be at least relaxed. Under independence, we propose an estimator $\hat\pi_0$ based on density estimation using both histograms and cross-validation. Due to the strong connection between the false discovery rate (FDR) and $\pi_0$, many multiple testing procedures (MTP) designed to control the FDR may be improved by introducing an estimator of $\pi_0$. We provide an example of such an improvement (plug-in MTP) based on the procedure of Benjamini and Hochberg. Asymptotic optimality results may be derived for both $\hat\pi_0$ and the resulting plug-in procedure. The latter ensures the desired asymptotic control of the FDR, while it is more powerful than the BH-procedure.

Finally, we compare our estimator of $\pi_0$ with other widespread estimators in a wide range of simulations. We obtain better results than other tested methods in terms of mean square error (MSE) of the proposed estimator. Finally, both asymptotic optimality results and the interest in tightly estimating $\pi_0$ are confirmed (empirically) by results obtained with the plug-in MTP.

Keywords: multiple testing, false discovery rate, density estimation, histograms, cross-validation 

Introduction

Multiple testing problems arise as soon as several hypotheses are tested simultaneously. As in test theory, we are concerned with the control of the type-I errors we may commit by falsely rejecting any tested hypothesis. Post-genomics, astrophysics and neuroimaging are typical areas in which multiple testing problems are encountered. In all these domains, the number of tests may be of the order of several thousands. Suppose we are testing each of $m$ hypotheses at level $0 < \alpha < 1$; the probability of at least one false positive (i.e. false rejection) may equal $m\alpha$ in the worst case. A possible way to cope with this is to use the Bonferroni procedure ([8]), which consists in testing each hypothesis at level $\alpha/m$. However, this method is known to be drastically conservative.

Since we may be more interested in controlling the proportion of false positives among rejections rather than the total number of false positives itself, Benjamini and Hochberg [3] introduced the false discovery rate (FDR), defined by

$$\mathrm{FDR} = \mathbb{E}\left[\frac{FP}{R \vee 1}\right],$$

where $a \vee b = \max(a, b)$, $FP$ denotes the number of false positives and $R$ is the total number of rejections. A large part of the literature is devoted to the building of multiple testing procedures (MTP) that upper bound the FDR as tightly as possible ([J, ISJ). For instance, that of Benjamini and Hochberg (BH-procedure) [3] ensures the following inequality under independence:

$$\mathrm{FDR} \le \pi_0 \alpha \le \alpha,$$



where $\pi_0$ denotes the unknown proportion of true null hypotheses, while $\alpha$ is the actual level at which we want to control the FDR. Since $\pi_0$ is unknown, the BH-procedure suffers some loss in power, which is all the more severe as $\pi_0$ is small. A natural idea to overcome this drawback is to compute an accurate estimator of $\pi_0$ and plug it into the procedure. Thus $\pi_0$ appears as a crucial quantity
that is to be estimated, hence the large amount of existing estimators. We refer to la, l6| for reviews on 



this topic. The randomness of this estimation needs to be taken into account in the assessment of the procedure performance ([11], [24]).
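
The BH step-up rule mentioned above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `benjamini_hochberg` is hypothetical, and the rule used is the standard one (reject the $k$ smallest p-values, where $k$ is the largest index with $p_{(k)} \le k\alpha/m$).

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha):
    """Return a boolean mask of the hypotheses rejected by the BH step-up rule."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    # Step-up rule: find the largest k such that p_(k) <= k * alpha / m.
    below = sorted_p <= alpha * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k_max = np.nonzero(below)[0].max()
        # All hypotheses with rank <= k_max are rejected, even if some
        # intermediate p-values lie above their own threshold.
        rejected[order[: k_max + 1]] = True
    return rejected

if __name__ == "__main__":
    print(benjamini_hochberg([0.001, 0.015, 0.025, 0.035, 0.5], alpha=0.05))
```

Because the thresholds $k\alpha/m$ do not use $\pi_0$, the rule behaves as if all hypotheses were true nulls, which is the source of the loss in power discussed above.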

In many recent papers about multiple testing (see [a, |9|, lid . Ill|), a two-component mixture density is used to describe the behaviour of the p-values associated with the $m$ tested hypotheses. As usual for mixture models, we need an assumption that ensures the identifiability of the model parameters. Thus, most estimators of $\pi_0$ rely on the strong assumption that only p-values following a uniform distribution on $[0, 1]$ are found in a neighbourhood of 1. However, Pounds et al. [17] recently observed the violation of this key assumption. They pointed out that some p-values associated with induced genes may be artificially sent near 1, for example when a one-sided test is performed while the non-tested alternative is true. To overcome this difficulty, we propose to estimate the density of p-values by non-regular histograms, providing a new estimator of $\pi_0$ that remains reliable in Pounds' framework thanks to a relaxed "identifiability assumption".

In the context of density estimation with the quadratic loss and histograms, asymptotic considerations have been used by Scott ([22]) for instance. A drawback of this approach is that it relies on regularity assumptions made on the unknown distribution. Some AIC-type penalized criteria as in Barron et al. [1] could be applied as well. However, such an approach depends on some unknown constants that have to be calibrated at the price of an intensive simulation step (see [16] in the regression framework). As it is both free of regularity assumptions and computationally cheap, we address the problem by means of cross-validation, first introduced in this context by Rudemo ([18]). More precisely, the leave-p-out cross-validation (LPO) is successfully applied following a strategy exposed in Celisse et al. [7]. Unlike Schweder and Spjøtvoll's estimator of $\pi_0$ ([21]), ours is fully adaptive thanks to the LPO-based approach, i.e. it does not depend on any user-specified parameter.

The paper is organized as follows. In Section 1, we present a cross-validation based estimator of $\pi_0$ (denoted by $\hat\pi_0$). Our main assumptions are specified and a description of the whole $\pi_0$ estimation procedure is given. Section 2 is devoted to asymptotic results such as the consistency of $\hat\pi_0$. Then we propose a plug-in multiple testing procedure (plug-in MTP), based on the same idea as that of Genovese et al. [11]. It is compared to the BH-procedure in terms of power and its asymptotic control of the FDR is derived. Section 3 is devoted to the assessment of our $\pi_0$ estimation procedure in a wide range of simulations. A comparison with other existing and widespread methods is carried out. The influence of the $\pi_0$ estimation on the power of the plug-in MTP is studied as well. In this study, the proposed method yields improved estimates in almost all settings.

1 Estimation of the proportion of true null hypotheses 

1.1 Mixture model 

Let $P_1, \dots, P_m$ be $m$ i.i.d. random variables following a density $g$ on $[0, 1]$. $P_1, \dots, P_m$ denote the p-values associated with the $m$ tested hypotheses. Taking into account the two populations of ($H_0$ and $H_1$) hypotheses, we assume ( [a, |9|, lll[ ) that $g$ may be written as

$$\forall x \in [0, 1],\quad g(x) = \pi_0 f_0(x) + (1 - \pi_0) f_1(x),$$

where $f_0$ (resp. $f_1$) denotes the density of $H_0$ (resp. $H_1$) p-values, that is p-values corresponding to true null (resp. false null) hypotheses, and $\pi_0$ is the unknown proportion of true null hypotheses. Moreover, we assume that $f_0$ is continuous, which ensures that $f_0 = 1$: $H_0$ p-values follow the uniform distribution $\mathcal{U}([0, 1])$. Subsequently, the above mixture becomes

$$\forall x \in [0, 1],\quad g(x) = \pi_0 + (1 - \pi_0) f_1(x), \tag{1}$$

where both $\pi_0$ and $f_1$ remain to be estimated.
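
To make the mixture (1) concrete, the following sketch draws p-values from it. It is only an illustration under stated assumptions: the function name `sample_pvalues` is hypothetical, and the alternative density is taken to be $f_1(t) = s(1-t)^{s-1}$, i.e. a Beta(1, s) density, which is one of the choices used in the simulation section.

```python
import numpy as np

def sample_pvalues(m, pi0, s, rng):
    """Draw m p-values from g = pi0 * U[0, 1] + (1 - pi0) * Beta(1, s).

    Returns the p-values and the (normally unobserved) true-null indicator.
    """
    is_null = rng.random(m) < pi0              # each hypothesis is a true null w.p. pi0
    null_draws = rng.random(m)                 # H0 p-values: uniform on [0, 1]
    alt_draws = rng.beta(1.0, s, size=m)       # H1 p-values: density s * (1 - t)^(s - 1)
    return np.where(is_null, null_draws, alt_draws), is_null

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, is_null = sample_pvalues(1000, 0.9, 10.0, rng)
    print(p[:5], is_null.mean())
```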

Most existing estimators of $\pi_0$ rely on a sufficient condition which ensures the identifiability of $\pi_0$. This assumption may be expressed as follows:

$$\exists \lambda^* \in\, ]0, 1] \;/\; \forall i \in \{1, \dots, m\},\quad P_i \in [\lambda^*, 1] \Rightarrow P_i \sim \mathcal{U}([\lambda^*, 1]). \tag{A}$$

(A) is therefore at the origin of Schweder and Spjøtvoll's estimator ([21]), further studied by Storey ([24], [25]). It depends on a cut-off $\lambda \in [0, 1]$ from which only $H_0$ p-values are observed. This estimation procedure is further detailed in Section 3. The same idea underlies the adaptive Benjamini and Hochberg step-up procedure described in [J|, based on the slope of the cumulative distribution function of p-values. If we assume $\lambda^* = 1$ (that is $f_1(1) = 0$), Grenander [12] and Storey et al. [26] choose $\hat g(1)$ to estimate $\pi_0$, where $\hat g$ denotes the estimator of $g$. Genovese et al. [11] use $(1 - G(t))/(1 - t)$, $t \in (0, 1)$, as an upper bound of $\pi_0$, which becomes (for $t$ large enough) an estimator as soon as (A) is true.
However, this assumption may be strongly violated, as noticed by Pounds et al. [17]. This point is detailed in Section 3.2. Following this remark, we propose the milder assumption (A'):

$$\exists \Lambda^* = [\lambda^*, \mu^*] \subset (0, 1] \;/\; \forall i \in \{1, \dots, m\},\quad P_i \in \Lambda^* \Rightarrow P_i \sim \mathcal{U}(\Lambda^*). \tag{A'}$$

While it is a generalization of (A), this assumption remains true in Pounds' framework, as we will see in Section 3.2. Scheid et al. [19] proposed a procedure named Twilight, which consists in a penalized criterion and provides, as a by-product, an estimation of $\pi_0$. Since this procedure does not rely on assumption (A), it should be taken as a reference competitor in the simulation study (Section 3) with respect to our proposed estimators.

1.2 A leave-p-out based density estimator 

If $g$ satisfies (A'), any "good estimator" of this density on $\Lambda^*$ would provide an estimate of $\pi_0$. Since $g$ is constant on the whole interval $\Lambda^*$, we adopt histogram estimators. Note that we do not really care about the rather poor approximation properties of histograms outside of $\Lambda^*$, as our goal is essentially the estimation of $\Lambda^*$ and of the restriction of $g$ to $\Lambda^*$, denoted by $g_{|\Lambda^*}$ in the sequel.

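For a given sample of observations $P_1, \dots, P_m$ and a partition of $[0, 1]$ in $D \in \mathbb{N}^*$ intervals $I = (I_k)_{k=1,\dots,D}$ of respective lengths $\omega_k = |I_k|$, the histogram $\hat s_\omega$ is defined by

$$\forall x \in [0, 1],\quad \hat s_\omega(x) = \sum_{k=1}^{D} \frac{m_k}{m\,\omega_k}\, \mathbb{1}_{I_k}(x),$$

where $m_k = \#\{i \in \{1, \dots, m\} : P_i \in I_k\}$.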
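
As an illustration of this definition, here is a minimal sketch of a histogram on an arbitrary (possibly non-regular) partition. The function name `histogram_density` and the representation of the partition by its sorted breakpoints `edges` are assumptions made for the example.

```python
import numpy as np

def histogram_density(pvalues, edges):
    """Heights m_k / (m * w_k) of the histogram built on the partition `edges`."""
    p = np.asarray(pvalues, dtype=float)
    edges = np.asarray(edges, dtype=float)
    counts, _ = np.histogram(p, bins=edges)    # m_k (the last interval is closed on the right)
    widths = np.diff(edges)                    # w_k = |I_k|
    return counts / (p.size * widths)

if __name__ == "__main__":
    sample = [0.05, 0.15, 0.5, 0.6, 0.7, 0.95]
    print(histogram_density(sample, [0.0, 0.1, 0.2, 0.9, 1.0]))
```

By construction the heights integrate to one over $[0, 1]$ whatever the interval widths, so a wide interval simply averages the density over a larger region.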

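If we denote by $\mathcal{S}$ the collection of histograms we consider, the "best estimator" among $\mathcal{S}$ is defined in terms of the quadratic risk:

$$\tilde s = \operatorname*{Argmin}_{s \in \mathcal{S}} \mathbb{E}_g\left[\|g - s\|_2^2\right] = \operatorname*{Argmin}_{s \in \mathcal{S}} \left\{ \mathbb{E}_g\left[\|s\|_2^2 - 2 \int_{[0,1]} s(x) g(x)\,dx\right] \right\}, \tag{2}$$

where the expectation is taken with respect to the unknown $g$. According to (2), we define $R$ by

$$R(s) = \mathbb{E}_g\left[\|s\|_2^2 - 2 \int_{[0,1]} s(x) g(x)\,dx\right]. \tag{3}$$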



In (3), we notice that $R$ still depends on $g$, which is unknown. To get rid of this, we use a cross-validation estimator of $R$ that will achieve the best trade-off between bias and variance. Following [13], we know that leave-one-out (LOO) estimators may suffer from high variability. For this reason we prefer the use of leave-p-out (LPO), keeping in mind that the choice of the parameter $p$ will enable the control of the bias-variance trade-off.

At this stage, we refer to Celisse et al. [7] for an exhaustive presentation of the leave-p-out (LPO) based strategy. Hereafter, we remind the reader what LPO cross-validation consists in and then give the main steps of the reasoning. First of all, it is based on the same idea as the well-known leave-one-out (see [13] for an introduction), to which it reduces for $p = 1$. For a given $p \in \{1, \dots, m-1\}$, let us split the sample $P_1, \dots, P_m$ into two subsets of respective sizes $m - p$ and $p$. The first one, called the training set, is devoted to the computation of the histogram estimator, whereas the second one (the test set) is used to assess the behaviour of the preceding estimator. These two steps have to be repeated $\binom{m}{p}$ times, which is the number of different subsets of cardinality $p$ among $\{P_1, \dots, P_m\}$.



Closed formula of the LPO risk. This outlined description of the LPO leads to the following closed formula for the LPO risk estimator of $R(\hat s_\omega)$ (see [7]): for any partition $I = (I_k)_{k=1,\dots,D}$ of $[0, 1]$ in $D$ intervals of length $\omega_k = |I_k|$ and $p \in \{1, \dots, m-1\}$,

$$\hat R_p(\omega) = \frac{2m - p}{(m-1)(m-p)} \sum_{k=1}^{D} \frac{m_k}{m\,\omega_k} - \frac{m(m-p+1)}{(m-1)(m-p)} \sum_{k=1}^{D} \frac{1}{\omega_k}\left(\frac{m_k}{m}\right)^2, \tag{4}$$

where $m_k = \#\{i \in \{1, \dots, m\} : P_i \in I_k\}$, $k = 1, \dots, D$. As it may be evaluated with a computational complexity of only $O(m \log m)$, (4) means that we have a very efficient estimator of the quadratic risk $R(\hat s_\omega)$. Now, we propose a strategy for the choice of $p$ that relies on the minimization of the mean square error criterion (MSE) of our LPO estimator of the risk. Indeed, among $\{\hat R_p(\hat s_\omega) : p \in \{1, \dots, m-1\}\}$, we would like to choose the estimator that achieves the best bias-variance trade-off. This goal is reached by means of the MSE criterion, defined as the sum of the squared bias and the variance of the LPO risk estimator. Thanks to (4), closed formulas for both the bias (5) and the variance (6) of the LPO risk estimator may be derived. We recall here these expressions that come from [7].
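
The closed formula can be checked numerically against the definition of leave-p-out. The sketch below is illustrative only: the function names are hypothetical, and the brute-force version averages $\|\hat s^{train}\|_2^2 - \frac{2}{p}\sum_{i \in \text{test}} \hat s^{train}(P_i)$ over all test sets of size $p$, which is the usual LPO estimate of the risk (3) and is assumed here to be the quantity behind (4).

```python
from itertools import combinations

import numpy as np

def lpo_risk(counts, widths, p):
    """Closed-form leave-p-out risk estimate of one histogram, as in formula (4)."""
    counts = np.asarray(counts, dtype=float)
    widths = np.asarray(widths, dtype=float)
    m = counts.sum()
    freq = counts / m
    linear = (2 * m - p) / ((m - 1) * (m - p)) * np.sum(freq / widths)
    quadratic = m * (m - p + 1) / ((m - 1) * (m - p)) * np.sum(freq**2 / widths)
    return linear - quadratic

def lpo_risk_bruteforce(pvalues, edges, p):
    """Same quantity by explicit enumeration of all test sets (tiny samples only)."""
    x = np.asarray(pvalues, dtype=float)
    edges = np.asarray(edges, dtype=float)
    widths = np.diff(edges)
    m, n_bins = x.size, widths.size
    # Bin index of each observation (last interval closed on the right).
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    total, n_splits = 0.0, 0
    for test in combinations(range(m), p):
        test = list(test)
        train = np.ones(m, dtype=bool)
        train[test] = False
        heights = np.bincount(idx[train], minlength=n_bins) / ((m - p) * widths)
        # Squared norm of the training histogram minus twice its mean value
        # at the held-out points.
        total += np.sum(heights**2 * widths) - 2.0 * heights[idx[test]].mean()
        n_splits += 1
    return total / n_splits

if __name__ == "__main__":
    x = np.array([0.02, 0.07, 0.11, 0.3, 0.42, 0.58, 0.77, 0.91])
    edges = np.array([0.0, 0.1, 0.2, 1.0])
    counts, _ = np.histogram(x, bins=edges)
    for p in (1, 2, 3):
        print(p, lpo_risk(counts, np.diff(edges), p), lpo_risk_bruteforce(x, edges, p))
```

The enumeration costs $\binom{m}{p}$ histogram fits, whereas the closed form only needs the bin counts, which is what makes an adaptive choice of $p$ affordable.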

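Bias and variance of the LPO risk estimator. Let $\omega$ correspond to a $D$-partition $(I_k)_k$ of $[0, 1]$ and, for any $k \in \{1, \dots, D\}$, $\alpha_k = \Pr[P_1 \in I_k]$, such that $\alpha = (\alpha_1, \dots, \alpha_D) \in [0, 1]^D$. Then for any $p \in \{1, \dots, m-1\}$,

$$B_p(\omega) = B_p(\alpha, \omega) = \frac{p}{m(m-p)} \sum_{k=1}^{D} \frac{\alpha_k(1 - \alpha_k)}{\omega_k}, \tag{5}$$

$$V_p(\omega) = V_p(\alpha, \omega) = \frac{p^2 \varphi_2(m, \alpha, \omega) + p\,\varphi_1(m, \alpha, \omega) + \varphi_0(m, \alpha, \omega)}{[m(m-1)(m-p)]^2}, \tag{6}$$

where

$$\forall (i, j) \in \{1, \dots, 3\} \times \{1, 2\},\quad s_{i,j} = \sum_{k=1}^{D} \frac{\alpha_k^i}{\omega_k^j},$$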

ip2{m,a,uj) = 2m(TO- 1) [(to- 2)(s2,i + S14 - 33^2) - ?7iS2,2 - (2m- 3)s2,i] , 
(pi{m,a,uj) = -2TO(m- l)(3m + 1) [(m - 2)(s2,i - ■53,2) - ms2,2] + 

2m(TO - 1) [2(to + l)(2m - 3)s2^i + (-Sto^ + 3m + 4)si,i] , 
ipo{m,a,uj) = 4m(TO- 1)(to + 1) [(m - 2)(s24 - 33^2) - ms2,2] - 

2m(TO - 1) [(to^ + 2to + 1)(2to - 3)s2,i + (2to^ - 4to - 2)si,i] + 

m{m- l)^(si,2 - s?,i) • 

Plug-in estimators may be obtained from the preceding quantities by just replacing $\alpha_k$ with $\hat\alpha_k = m_k/m$ in the expressions. Following our idea about the choice of $p$, we define for each partition $\omega$ the best theoretical value $p^*$ as the minimum location of the MSE criterion:

$$p^* = \operatorname*{Argmin}_{p \in \{1, \dots, m-1\}} MSE(p) = \operatorname*{Argmin}_{p} \left\{ B_p(\omega)^2 + V_p(\omega) \right\}. \tag{7}$$

The main point is that this minimization problem has an explicit solution named $p^*_{\mathbb{R}}$, as stated by Theorem 3.1 in [7]. For the sake of clarity, we recall the MSE expression.

Minimum location expression. With the same notations as for the bias and the variance, we obtain for any $x \in \mathbb{R}$,

$$MSE(x) = \frac{x^2\left[\varphi_3(m, \alpha, \omega) + \varphi_2(m, \alpha, \omega)\right] + x\,\varphi_1(m, \alpha, \omega) + \varphi_0(m, \alpha, \omega)}{[m(m-1)(m-x)]^2},$$

where $\varphi_3(m, \alpha, \omega) = (m-1)^2 (s_{1,1} - s_{2,1})^2$.

Thus, we define our best choice $\hat p$ for the parameter $p$ by

$$\hat p = \begin{cases} k(\hat p_{\mathbb{R}}), & \text{if } \hat p_{\mathbb{R}} \in [1, m-1], \\ 1, & \text{otherwise,} \end{cases} \tag{8}$$

where $k(x)$ denotes the integer closest to $x$ and $\hat p_{\mathbb{R}}$ has the same definition as $p^*_{\mathbb{R}}$, but with $\hat\alpha$ instead of $\alpha$ in the expression.

Remark: There may be a real interest in choosing the parameter $p$ adaptively, rather than fixing $p = 1$. Indeed, in the regression framework for instance, Shao [23] and Yang [28] underline that the simple and widespread LOO may be sub-optimal with respect to LPO with a larger $p$. In the linear regression set-up, Shao even shows that $p/m \to 1$ as $m \to +\infty$ is necessary to get consistency in selection.

1.3 Estimation procedure of $\pi_0$
1.3.1 Collection of non-regular histograms 



We now specify the collection of histograms we will consider. For given integers $N_{\min} < N_{\max}$, we build a regular grid of $[0, 1]$ in $N$ intervals (of length $1/N$) with $N \in \{N_{\min}, \dots, N_{\max}\}$. For a couple of integers $0 \le k < \ell \le N$, we define a unique histogram made of $k$ first regular columns of width $1/N$, then a wide central column of length $(\ell - k)/N$ and finally $N - \ell$ thin regular columns of width $1/N$. An example of such a histogram is given in Figure 1. The collection $\mathcal{S}$ of the histograms we consider is defined by

$$\mathcal{S} = \bigcup_{N \in \{N_{\min}, \dots, N_{\max}\}} \mathcal{S}_N,$$

where

$$\forall N,\quad \mathcal{S}_N = \left\{ \hat s_\omega :\ \omega_{k+1} = (\ell - k)/N,\ \omega_i = 1/N \text{ for } i \neq k+1,\ 0 \le k < \ell \le N \right\}.$$

Figure 1: Example of non-regular histogram in $\mathcal{S}$. There are $k = 7$ regular columns from 0 to $\lambda = k/N$, a wide central column from $\lambda$ to $\mu = \ell/N$, and $N - \ell = 7$ regular columns of width $1/N$ from $\mu$ to 1.



Provided (A') is fulfilled, we expect for each $N$ a selected histogram with its wide central interval $[\lambda, \mu]$ close to $\Lambda^*$. The comparison of all these histograms (one per value of $N$) makes it possible to relax the dependence of each selected histogram on the grid width $1/N$.
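
The collection can be enumerated directly from the triples $(N, k, \ell)$. The sketch below is an illustration with hypothetical function names; it assumes the index range $0 \le k < \ell \le N$, so that the central column may start at 0 or end at 1.

```python
def partitions(n_min, n_max):
    """Yield (N, k, l) with 0 <= k < l <= N: k thin columns, one wide column, N - l thin columns."""
    for n in range(n_min, n_max + 1):
        for k in range(n):
            for l in range(k + 1, n + 1):
                yield n, k, l

def breakpoints(n, k, l):
    """Breakpoints of the partition: the regular grid with the cells between k/N and l/N merged."""
    return [i / n for i in range(k + 1)] + [i / n for i in range(l, n + 1)]

if __name__ == "__main__":
    print(list(partitions(1, 2)))
    print(breakpoints(4, 1, 3))
```

Each value of $N$ contributes $N(N+1)/2$ triples, so the whole collection stays small enough to be scanned exhaustively for moderate $N_{\max}$. Note that all triples with $\ell = k + 1$ describe the same regular histogram.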

1.3.2 Estimation procedure 

Following the idea at the beginning of Section 1.2, $\hat\pi_0$ will consist of the height of the selected histogram on its central interval $[\hat\lambda, \hat\mu]$. More precisely, we propose the following estimation procedure for $\pi_0$. For each partition (represented here by the vector $\omega$), we compute $\hat p(\omega) = \operatorname{Argmin}_p \widehat{MSE}(p, \omega)$, where $\widehat{MSE}$ denotes the MSE estimator obtained by plugging $m_k/m$ in place of $\alpha_k$ in the expressions of (7). The best (in terms of the bias-variance trade-off) LPO estimator of the quadratic risk $R(\hat s_\omega)$ is therefore $\hat R_{\hat p(\omega)}(\omega)$. Then we choose the histogram that reaches the minimum of the latter criterion over $\mathcal{S}$. From this histogram, we finally get both the interval $[\hat\lambda, \hat\mu]$, which estimates $\Lambda^*$, and

$$\hat\pi_0 = \hat\pi_0(\hat\lambda, \hat\mu) \overset{def}{=} \frac{\#\{i : P_i \in [\hat\lambda, \hat\mu]\}}{m(\hat\mu - \hat\lambda)}.$$

These steps are outlined hereafter.
Procedure:

1. For each partition denoted by $\omega$, define $\hat p(\omega) = \operatorname{Argmin}_p \widehat{MSE}(p, \omega)$.

2. Find the best partition $\hat\omega = \operatorname{Argmin}_\omega \hat R_{\hat p(\omega)}(\omega)$.

3. From $\hat\omega$, get $(\hat\lambda, \hat\mu)$.

4. Compute the estimator $\hat\pi_0 = \dfrac{\#\{i : P_i \in [\hat\lambda, \hat\mu]\}}{m(\hat\mu - \hat\lambda)}$.
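
The procedure can be sketched end to end as follows. This is a simplified illustration and not the authors' implementation: step 1 is skipped and $p = 1$ (leave-one-out) is used for every partition instead of the adaptive $\hat p(\omega)$, the function names are hypothetical, and the simulated data in the usage part are an arbitrary choice.

```python
import numpy as np

def loo_risk(counts, widths):
    """Formula (4) with p = 1 (leave-one-out)."""
    m = counts.sum()
    freq = counts / m
    return ((2 * m - 1) * np.sum(freq / widths) - m * m * np.sum(freq**2 / widths)) / (m - 1) ** 2

def estimate_pi0(pvalues, n_min=1, n_max=20):
    """Return (pi0_hat, lambda_hat, mu_hat) selected over the non-regular histograms."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    best_risk, best = np.inf, None
    for n in range(n_min, n_max + 1):
        grid_counts, _ = np.histogram(p, bins=n, range=(0.0, 1.0))
        grid_counts = grid_counts.astype(float)
        for k in range(n):
            for l in range(k + 1, n + 1):
                # k thin columns, one wide column [k/n, l/n], then n - l thin columns.
                counts = np.concatenate((grid_counts[:k], [grid_counts[k:l].sum()], grid_counts[l:]))
                widths = np.concatenate((np.full(k, 1.0 / n), [(l - k) / n], np.full(n - l, 1.0 / n)))
                risk = loo_risk(counts, widths)
                if risk < best_risk:
                    best_risk, best = risk, (n, k, l, counts[k])
    n, k, l, central_count = best
    lam, mu = k / n, l / n
    # Height of the selected histogram on its central interval.
    return central_count / (m * (mu - lam)), lam, mu

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = 2000
    is_null = rng.random(m) < 0.9
    pvals = np.where(is_null, rng.random(m), rng.beta(1.0, 25.0, size=m))
    print(estimate_pi0(pvals, n_min=1, n_max=15))
```

Replacing `loo_risk` by the LPO formula evaluated at a data-driven $\hat p(\omega)$ would recover the procedure described above; the exhaustive scan over $(N, k, \ell)$ is unchanged.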

2 Asymptotic results 

2.1 Pointwise convergence of LPO risk estimator 

Lemma 2.1. Following the notations in Section 1.2, for any $p \in \{1, \dots, m-1\}$ and $\omega$, we have

$$MSE(p, \omega) = O_{m \to +\infty}(1/m).$$

Moreover, if $s_{2,2} + s_{3,2} - s_{1,1} - s_{2,1} \neq 0$,

$$\hat p(\omega)/m \xrightarrow[m \to +\infty]{a.s.} \ell_\infty(\omega),$$

where $\ell_\infty(\omega) \in [0, 1]$.
Proof. 

1. We see that 

^3 + 952 == 2m^[s24 + si,i ~ S3,2 - S2,2 - 2s2 ;^] +o(m^), 

(ySi = 2to**[3s3^2 + 3s2,2 - 3s2a +4s2_i - 3si,i] +o(m"'), 
(ySo = — 4m''^[s2 ;^ + Si,i] + o(m^). 

Thus for any $p \in \{1, \dots, m-1\}$ and partition of size vector $\omega \in [0, 1]^D$, we have

$$MSE(p, \omega) = O_{m \to +\infty}\left(\frac{1}{m}\right).$$

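2. Simple calculations lead to

$$\frac{p^*_{\mathbb{R}}(\omega)}{m} \xrightarrow[m \to +\infty]{} \frac{3(s_{2,1} - s_{3,2} - s_{2,2}) + 7 s_{1,1}}{s_{1,1} + s_{2,1} - s_{2,2} - s_{3,2}} = \ell(\alpha, \omega).$$

As for any $k$, $\hat\alpha_k \xrightarrow[m \to +\infty]{a.s.} \alpha_k$, the continuous mapping theorem implies the almost sure convergence.

Finally, the result follows by setting $\ell(\omega) = \ell(\alpha, \omega)$ and $\ell_\infty(\omega) = \mathbb{1}_{\{\ell(\alpha, \omega) \in [0, 1]\}}\, \ell(\alpha, \omega)$.

□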

Proposition 2.1. For any given $\omega$, define $\hat p(\omega)$ as in Section 1.2 and $\hat L_p(\omega) = \hat R_p(\omega) + \|g\|_2^2$. If $\ell_\infty(\omega) \neq 1$, we have

$$\hat L(\omega) \overset{def}{=} \hat L_{\hat p(\omega)}(\omega) \xrightarrow[m \to +\infty]{P} L(\omega) \overset{def}{=} \|g - s_\omega\|_2^2.$$

Remark: Note that the assumption on $\ell_\infty$ seems rather natural. It means that the test set must be (at most) of the same size as the training set ($p/(m - p) = O_P(1)$). Moreover, $\ell_\infty(\omega) = 1$ if and only if $s_{2,1} - s_{2,2} - s_{3,2} = -3 s_{1,1}$, which holds for very specific densities.

Proof. The first part of Lemma 2.1 implies that $\hat R_p(\omega) - R(\hat s_\omega) \xrightarrow[m \to +\infty]{P} 0$. Combined with $R(\hat s_\omega) \xrightarrow[m \to \infty]{} L(\omega) - \|g\|_2^2$, it yields that for any fixed $p$,

$$\hat L_p(\omega) \xrightarrow[m \to +\infty]{P} L(\omega).$$

Finally, the result follows from both the continuous mapping theorem and the assumption on $\ell_\infty$. □



2.2 Consistency of $\hat\pi_0$

We first emphasize that for a given $N \in \{N_{\min}, \dots, N_{\max}\}$, any histogram in $\mathcal{S}_N$ is associated with a given partition of $[0, 1]$ that may be uniquely represented by $(N, \lambda, \mu)$. We now give the first lemma of the consistency proof.

Lemma 2.2. For $\lambda^* \neq \mu^* \in [0, 1]$, let $s$ be a constant density on $[\lambda^*, \mu^*]$. Suppose $N_{\min}$ such that for any $N_{\min} \le N$, there exists a partition $(N, \lambda, \mu)$ satisfying $0 < \mu - \lambda \le \mu^* - \lambda^*$. For a given $N$, let $\omega_N$ represent the partition $(N, \lambda_N, \mu_N)$ with $\lambda_N = \lceil N\lambda^* \rceil / N$ and $\mu_N = \lfloor N\mu^* \rfloor / N$. Define $s_\omega$ as the orthogonal projection of $s$ onto piecewise constant functions built from the partition associated with $\omega$. If the dimension of a partition is its number of pieces, then $\omega_N$ is the partition with the smallest dimension satisfying

$$\omega_N \in \operatorname*{Argmin}_\omega \|s - s_\omega\|_2^2.$$

Proof. For symmetry reasons, we deal with partitions, for a given $N$, made of regular columns of width $1/N$ from 0 to $\lambda$ and only one column from $\lambda$ to 1 (i.e. we set $\mu = 1$). In the sequel, $I^{(N)}$ denotes the partition associated with $\omega_N$.

1. Suppose that there exists $\omega_0$ such that $s = s_{\omega_0}$. Then $\|s - s_{\omega_N}\|_2^2 = 0$ and $\omega_N \in \operatorname{Argmin}_\omega \|s - s_\omega\|_2^2$.

2. Otherwise, $s$ does not equal any $s_\omega$.

(a) If $\lambda^* = k/N$, then $\lambda_N = \lambda^*$. Any subdivision $I$ of $I^{(N)}$ satisfies $\|s - s_\omega\|_2^2 = \|s - s_{\omega_N}\|_2^2$, where $\omega$ corresponds to $I$. Now, let $\mathcal{F}_I$ be the set of piecewise constant functions built from a partition $I$. For any partition $I = (I_k)_k$ such that every cell of $I^{(N)}$ is included in some cell $I_\ell$ of $I$, we have $\mathcal{F}_I \subset \mathcal{F}_{I^{(N)}}$. Thus $\|s - s_\omega\|_2^2 = \|s - s_{\omega_N}\|_2^2 + \|s_{\omega_N} - s_\omega\|_2^2$, since $s_{\omega_N} - s_\omega \in \mathcal{F}_{I^{(N)}}$. Therefore, $\omega_N \in \operatorname{Argmin}_\omega \|s - s_\omega\|_2^2$.

(b) If $\lambda^* \notin \{1/N, \dots, 1\}$: as before, any subdivision of $I^{(N)}$ will have the same bias, whereas the bias is larger for any partition containing $I^{(N)}$. So, $\omega_N \in \operatorname{Argmin}_\omega \|s - s_\omega\|_2^2$.

□



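Lemma 2.3. With the same notations as before, we define $L(\omega) = \|s - s_\omega\|_2^2$. Let $\hat L$ be a random process indexed by the set of partitions $\Omega$ such that $\hat L(\omega') \xrightarrow[m \to +\infty]{P} L(\omega')$, for any $\omega' \in \Omega$. If $\hat\omega \in \operatorname{Argmin}_\omega \hat L(\omega)$, then

$$L(\hat\omega) \xrightarrow[m \to +\infty]{P} \min\{L(\omega) : \omega \in \Omega\}.$$

Proof. Set $\Gamma \subset \Omega$ such that $\forall \omega \in \Gamma$, $L(\omega) = \min_{\omega' \in \Omega} L(\omega')$, and define $\delta = \min_{\omega \in \Gamma,\, \omega' \notin \Gamma} |L(\omega) - L(\omega')|/2$. For $|\Omega| = k$ and $|\Gamma| = \ell$, we have the ordered quantities $L(\omega^1) = \dots = L(\omega^\ell) < L(\omega^{\ell+1}) \le \dots \le L(\omega^k)$. Set $\epsilon > 0$. For each $\omega^i$, there exists $m_i$ (large enough) such that for $m \ge m_i$, $|\hat L(\omega^i) - L(\omega^i)| < \epsilon$, with high probability. For $m_{\max} = \max_i m_i$, we get $\max_{\omega \in \Omega} |\hat L(\omega) - L(\omega)| < \epsilon$ in probability. Thanks to the latter inequality and by definition of $\hat\omega$,

$$L(\hat\omega) < \hat L(\hat\omega) + \epsilon \le \hat L(\omega) + \epsilon < L(\omega) + 2\epsilon, \quad \text{in probability,}$$

for any $\omega \in \Omega \setminus \Gamma$. Hence, we obtain

$$L(\hat\omega) < \min_{\omega \in \Omega \setminus \Gamma} L(\omega) = L(\omega^{\ell+1}), \quad \text{in probability.}$$

Thus, $\hat\omega \in \Gamma$ with high probability and the result follows. □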



Theorem 2.1. For $0 < \lambda^* < \mu^* \le 1$, let $s : [0, 1] \to [0, 1]$ be a constant function on $[\lambda^*, \mu^*]$ such that $s$ is not constant on any interval $I$ with $[\lambda^*, \mu^*] \subsetneq I$ (if it exists). Suppose $N_{\min}$ such that for any $N_{\min} \le N \le N_{\max}$, there exists a partition $(N, \lambda, \mu)$ satisfying $0 < \mu - \lambda \le \mu^* - \lambda^*$. Set $\Omega = \cup_N \Omega_N$, where $\Omega_N$ denotes the partitions associated with $\mathcal{S}_N$. If $\hat\pi_0$ is the estimator described in Section 1.3.2 selected from $\Omega$, then

$$\hat\pi_0 \xrightarrow[m \to +\infty]{P} \pi_0.$$



Proof For e > and Nn^i^ < N < N, 

Pr[|^o-^o| >e] = Pr 
< Pr 



A* + fi* 



[A,/i]5^[A*,M* 



A + /2 



> e 



Pr 



'■^LU '^LU I 



'2,[A,m] 



>eHp-X) 



< Pr[\L{Q)-LiujN)\> S]+Ft 



sup||sc^ -Su,\\l> e'^/N^ 



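for some $\delta > 0$ ($\|\cdot\|_{2,[\hat\lambda,\hat\mu]}$ denotes the quadratic norm restricted to $[\hat\lambda, \hat\mu]$). As the cardinality of the set of partitions is finite ($N_{\max}$ does not depend on $m$),

$$\Pr\left[\sup_\omega \|\hat s_\omega - s_\omega\|_2^2 > \epsilon^2 / N_{\max}\right] \xrightarrow[m \to +\infty]{} 0.$$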



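We use the following inequality $|L(\hat\omega) - L(\omega_N)| - |\hat L(\hat\omega) - L(\hat\omega)| \le |\hat L(\hat\omega) - L(\omega_N)|$ and the uniform convergence in probability of $\hat L - L$ over $\Omega$ ($|\Omega| < +\infty$) to get

$$\Pr\left[|L(\hat\omega) - L(\omega_N)| > \delta\right] \le \Pr\left[|\hat L(\hat\omega) - L(\omega_N)| > \delta'\right]$$

for some $\delta' > 0$. The result comes from both Lemma 2.2 and Lemma 2.3.

□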



2.3 Asymptotic optimality of the plug-in MTP 

The following is inspired by both [11] and [25]. In the sequel, we will recall some of their results to state the link. First of all, for any $\theta \in [0, 1]$, set

$$\forall t \in (0, 1],\quad Q_\theta(t) = \frac{\theta t}{G(t)} \quad \text{and} \quad \hat Q_\theta(t) = \frac{\theta t}{\hat G(t)},$$

where $G$ (resp. $\hat G$) denotes the (empirical) cumulative distribution function of p-values. Let us define the threshold $T_\alpha(\theta) = T(\alpha, \theta, \hat G) = \sup\{t \in (0, 1) : \hat Q_\theta(t) \le \alpha\}$. Now we are in position to define our plug-in procedure:

Definition 2.1 (Plug-in MTP). Reject all hypotheses with p-values less than or equal to the threshold $T_\alpha(\hat\pi_0)$.

Storey et al. [25] established the equivalence between the BH-procedure and the procedure consisting in rejecting hypotheses associated with p-values less than or equal to the threshold $T_\alpha(1)$, named the step-up $T_\alpha(1)$ procedure. We may slightly extend Lemma 1 and Lemma 2 in [25] by using similar proofs, so that they are omitted here.



Lemma 2.4. With the same notations as before, we have

(i) the step-up procedure $T_\alpha(\hat\pi_0(0, 1)) = T_\alpha(1)$ is equivalent to the BH-procedure in that they both reject the same hypotheses,

(ii) the step-up procedure $T_\alpha(\hat\pi_0(\hat\lambda, \hat\mu))$ is equivalent to the BH-procedure with $m$ replaced by $\hat\pi_0(\hat\lambda, \hat\mu)\, m$.

Thus, we observe that the introduction of $\hat\pi_0$ (supplementary information) in our procedure entails the rejection of at least as many hypotheses as the BH-procedure ($T_\alpha$ is nonincreasing). Hence our plug-in procedure should be more powerful, provided it controls the FDR at the required level $\alpha$.
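
A minimal sketch of the plug-in step-up rule follows. It is an illustration with a hypothetical function name: the supremum defining the threshold is evaluated only at the observed p-values, where the empirical distribution function equals $k/m$ at $p_{(k)}$. The exact supremum may lie slightly above the returned p-value, but the set of rejected hypotheses is the same.

```python
import numpy as np

def plugin_step_up(pvalues, alpha, pi0_hat=1.0):
    """Reject the p-values <= T, with T = sup{t : pi0_hat * t / G_hat(t) <= alpha}."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    sorted_p = np.sort(p)
    # At t = p_(k), G_hat(t) = k / m, so the criterion reads
    # pi0_hat * m * p_(k) / k <= alpha.
    ok = pi0_hat * m * sorted_p / np.arange(1, m + 1) <= alpha
    if not ok.any():
        return np.zeros(m, dtype=bool)
    threshold = sorted_p[np.nonzero(ok)[0].max()]
    return p <= threshold

if __name__ == "__main__":
    pvals = [0.03, 0.5]
    print(plugin_step_up(pvals, alpha=0.05))               # pi0_hat = 1: BH thresholds
    print(plugin_step_up(pvals, alpha=0.05, pi0_hat=0.5))  # smaller pi0_hat: larger threshold
```

With `pi0_hat=1.0` the criterion reduces to $p_{(k)} \le k\alpha/m$, the BH rule; any value below 1 can only enlarge the rejection set, in line with the monotonicity noted above.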
We settle this question now, at least asymptotically, thanks to a slight generalization of Theorem 5.2 in [11] to the case where $G$ is not necessarily concave (see the "U-shape" framework described in Section for instance). For $t \in [0, 1]$, let us define $FP(t)$ (resp. $R(t)$) as the number of $H_0$ (resp. the total number of) p-values lower than or equal to $t$ and set $\Gamma(t) = FP(t)/(R(t) \vee 1)$. Thus,

$$\forall t \in [0, 1],\quad FDR(t) = \mathbb{E}[\Gamma(t)].$$

Theorem 2.2. For any $\delta > 0$ and $\alpha \in [0, \pi_0[$, define $\hat\pi_0^\delta = \hat\pi_0 + \delta$ (and likewise $\pi_0^\delta = \pi_0 + \delta$). Assume that the density $f_1$ of $H_1$ p-values is differentiable and is nonincreasing on $[0, \lambda^*]$, vanishes on $[\lambda^*, \mu^*]$ and is nondecreasing on $[\mu^*, 1]$. Then

(i) $Q_{\pi_0}$ is increasing on $I_\alpha = Q_{\pi_0}^{-1}([0, \alpha])$,

(ii) $\mathbb{E}\left[\Gamma\left(T_\alpha(\hat\pi_0^\delta)\right)\right] \le \alpha + o(1)$.

Remarks:

Note that the only interesting choice of $\alpha$ actually lies in $[0, \pi_0)$. If $\alpha \ge \pi_0$, then $FDR(t) \le \alpha$ is satisfied in the non-desirable case where all hypotheses are rejected.

A sufficient condition on $G$ for the increase of $Q_{\pi_0}$ is that $G$ be continuously differentiable and $G'(t) \le G(t)/t$, $\forall t \in (0, 1]$. Thus, $G$ may be nondecreasing (not necessarily concave) and $Q_{\pi_0}$ may still increase.

To prove Theorem 2.2, we first need a useful lemma, the technical proof of which is deferred to the Appendix.

Lemma 2.5. With the above notations, for any $\alpha \in (0, 1]$, $T(\alpha, \cdot, G) : [0, 1] \to [0, 1]$ is continuous a.s. Moreover, for any $\theta \in [0, 1]$, $G \mapsto T(\alpha, \theta, G)$ is continuous on $B^+([0, 1])$, the set of positive bounded functions on $[0, 1]$, endowed with the norm $\|\cdot\|_\infty$.

Proof. (Theorem 2.2)

(i) As $f_1$ is differentiable and nonincreasing, $G$ is concave on $[0, \mu^*]$ and $Q_{\pi_0}$ increases on this interval. Following the above remarks, $Q_{\pi_0}$ is still increasing provided $G'(t) \le G(t)/t$ for $t \in [\mu^*, 1]$. Thus, provided $G'(t) \le G(t)/t$, $\forall t \in [\mu^*, 1]$, $Q_{\pi_0}$ increases on $[\mu^*, 1]$. Otherwise, there exists $t_0 \in [\mu^*, 1]$ such that $G'(t_0) = G(t_0)/t_0$. Then, the increase of $f_1$ ensures that $G(x)/x \le G'(x)$, $\forall x \ge t_0$. Hence, $Q_{\pi_0}$ is nonincreasing on $[t_0, 1]$. Finally, since $Q_{\pi_0}(1) = \pi_0$, $Q_{\pi_0}$ is increasing on $I_\alpha$.

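(ii) Rewrite first the difference

$$\begin{aligned}
\Gamma\big(T(\alpha, \hat\pi_0^\delta, \hat G)\big) - \alpha ={}& \Gamma\big(T(\alpha, \hat\pi_0^\delta, \hat G)\big) - Q_{\pi_0}\big(T(\alpha, \hat\pi_0^\delta, \hat G)\big) \\
&+ Q_{\pi_0}\big(T(\alpha, \hat\pi_0^\delta, \hat G)\big) - Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, \hat G)\big) && (9) \\
&+ Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, \hat G)\big) - Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, G)\big) && (10) \\
&+ Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, G)\big) - \alpha. && (11)
\end{aligned}$$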



Set 77 > such that 2ri < T{a, 7t^,G). Note that 



Thus thanks to Lemma [^751 



T{a,^lG)<ii 



< 



[T{a,4,G)<v + op{l)] 



0. 



rn — >+oo 



Besides, both Theorem 4.4 of [ll| and Prohorov's theorem ([27'|) imply that 

1 



\Vrn(r-Q^g) ||oo,[r,,i] 



FT 



^(a,^*,G)) -g,„ (T(a,^o^G))] = o{l) 



= 0(1). 



Hence E 

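Thanks to Lemma 2.5, the uniform continuity of $Q_{\pi_0}$, combined with the convergence in probability of $\hat\pi_0$, ensures that the expectation of (9) is of the order of $o(1)$.

Since $T(\alpha, \pi_0^\delta, G) = \sup\{t : Q_{\pi_0}(t) \le \alpha \pi_0 / \pi_0^\delta\}$, $\beta = \pi_0/\pi_0^\delta < 1$ and $Q_{\pi_0}$ is a one-to-one mapping on $I_\alpha$, we get $Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, G)\big) = Q_{\pi_0}\big(Q_{\pi_0}^{-1}(\alpha\beta)\big) = \alpha\beta$. Thus,

$$Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, \hat G)\big) - Q_{\pi_0}\big(T(\alpha, \pi_0^\delta, G)\big) = Q_{\pi_0}\big(T(\alpha\beta, \pi_0, \hat G)\big) - \alpha\beta.$$

Theorem 5.1 ([11]) applied with $\alpha\beta$ instead of $\alpha$ and $t_0 = Q_{\pi_0}^{-1}(\alpha\beta)$ entails that the expectation of (10) is $o(1)$ as well.

Finally, (11) is equal to $(\beta - 1)\alpha \le 0$.

□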



3 Simulations and Discussion 

3.1 Comparison in the usual framework ($\mu = 1$)

By "usual framework", we mean that the unknown $f_1$ in the mixture (1) is a decreasing density satisfying assumption (A): it vanishes on an interval $[\lambda^*, 1]$ with $\lambda^*$ possibly equal to 1. In this framework,

$$\hat\pi_0 = \frac{\#\{i : P_i \in [\hat\lambda, 1]\}}{m(1 - \hat\lambda)}.$$

Except for $\hat\lambda$, this general expression was introduced by Schweder et al. [21]. Their estimator

$$\hat\pi_0^{St}(\lambda) = \frac{\#\{i : P_i \in [\lambda, 1]\}}{m(1 - \lambda)}$$

is based on (A) and strongly depends on the parameter $\lambda \in [0, 1]$ that is supposed to be given, but is totally unknown in practice. A crucial issue ([15]) is precisely the determination of an 'optimal' $\lambda$

Table 1: Results for the two simulation conditions $(\lambda^*, s) = (0.2, 4)$ and $(\lambda^*, s) = (0.4, 6)$. The LPO and LOO based methods are compared to the Schweder and Spjøtvoll estimator $\hat\pi_0^{St}$, computed with $\lambda = 0.5$. (All displayed quantities are multiplied by 100.)

$\pi_0 = 0.9$ | $\lambda^* = 0.2$, $s = 4$ | | | $\lambda^* = 0.4$, $s = 6$ | |
Method | Bias | Std | MSE | Bias | Std | MSE
LPO | 0.39 | 2.5 | 6.41 10^-2 | 0.56 | 2.8 | 8.00 10^-2
LOO | 0.46 | 2.3 | 5.52 10^-2 | 0.61 | 2.7 | 7.66 10^-2
$\hat\pi_0^{St}$ | -0.15 | 3.2 | 9.94 10^-2 | 0.24 | 3.1 | 9.58 10^-2

3.1.1 A potential gain in choosing $\lambda$

In 2002, Storey [24] studied further this estimator and even proposed ([26]) the systematic value $\lambda = 0.5$ as a quite good choice. In the following, we show that even if assumption (A) is satisfied for $\lambda^* = 0.2$ or 0.4, there is a real potential gain in choosing $\lambda$ in an adaptive way.
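
For reference, the fixed cut-off estimator discussed here takes one line to compute. The sketch below uses a hypothetical function name and counts the p-values in $[\lambda, 1]$, following the expression of Section 3.1.

```python
import numpy as np

def pi0_fixed_lambda(pvalues, lam=0.5):
    """#{i : P_i >= lam} / (m * (1 - lam)), the estimator with a user-chosen cut-off."""
    p = np.asarray(pvalues, dtype=float)
    return np.count_nonzero(p >= lam) / (p.size * (1.0 - lam))

if __name__ == "__main__":
    print(pi0_fixed_lambda([0.1, 0.6, 0.7, 0.9], lam=0.5))
```

Nothing in the formula constrains the result to $[0, 1]$: on small samples the value can exceed 1, and a small $\lambda$ lets $H_1$ p-values inflate the count while a large $\lambda$ leaves few observations and a high variance.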

In the following simulations, the unknown density $f_1$ in the mixture (1) is a beta density on $[0, \lambda^*]$ with parameter $s$:

$$f_1(t) = \frac{s}{\lambda^*}\left(1 - \frac{t}{\lambda^*}\right)^{s-1} \mathbb{1}_{[0, \lambda^*]}(t),$$

where $(\lambda^*, s) \in \{(0.2, 4), (0.4, 6)\}$. The beta distribution is all the more sharp in the neighbourhood of 0 as $s$ is large. The proportion $\pi_0$ is equal to 0.9 and the sample size is $m = 1000$, while $n = 500$ repetitions have been made. There does not seem to be any strong sensitivity to the choice of $N_{\max}$ (data not shown here), as long as $N_{\max}$ is not too small. Until the end of the paper, $N_{\min} = 1$ and $N_{\max} = 100$.
Table 1 shows the simulation results for the leave-p-out (LPO) and the leave-one-out (LOO) based estimators of $\pi_0$, compared to that of Schweder and Spjøtvoll for $\lambda = 0.5$, denoted by $\hat\pi_0^{St}$. We see that in both cases, LPO is less biased than LOO but slightly more variable, which leads to a higher value for the MSE. This larger variability may be due to the supplementary randomness induced by the choice of $\lambda$. Both LPO and LOO seem a bit conservative unlike $\hat\pi_0^{St}$, which is however a little less biased. We say that an estimator of $\pi_0$ is conservative as soon as it upper-bounds $\pi_0$ on average. The main conclusion is that the MSE of LPO (and LOO) is always lower than that of $\hat\pi_0^{St}$, even if the assumption (A) is satisfied ($\lambda = 0.5 > \lambda^*$). An adaptive choice of $\lambda$ may provide a more accurate estimation of $\pi_0$, which is all the more important as $m$ grows.
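
The Bias, Std and MSE columns of such a study can be obtained by plain Monte Carlo. The sketch below does so for the fixed cut-off estimator only. It is an illustration under assumptions: the function name is hypothetical, the membership of each hypothesis is drawn at random (the text does not say whether the number of true nulls is fixed), and $H_1$ p-values are generated as $\lambda^*$ times a Beta(1, s) variable, which has the density $f_1$ given above.

```python
import numpy as np

def simulate_fixed_lambda(pi0=0.9, lam_star=0.2, s=4, m=1000, n_rep=500, lam=0.5, seed=0):
    """Monte Carlo bias, standard deviation and MSE of the fixed cut-off estimator."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        is_null = rng.random(m) < pi0
        # lam_star * Beta(1, s) has density (s / lam_star) * (1 - t / lam_star)^(s - 1) on [0, lam_star].
        p = np.where(is_null, rng.random(m), lam_star * rng.beta(1.0, s, size=m))
        estimates[r] = np.count_nonzero(p >= lam) / (m * (1.0 - lam))
    bias = estimates.mean() - pi0
    std = estimates.std(ddof=1)
    mse = np.mean((estimates - pi0) ** 2)
    return bias, std, mse

if __name__ == "__main__":
    bias, std, mse = simulate_fixed_lambda()
    print(f"bias={100 * bias:.2f}  std={100 * std:.2f}  mse={100 * mse:.4f}  (all x100)")
```

Under these assumptions the count of p-values above 0.5 is binomial with success probability $0.9 \times 0.5 = 0.45$, so the standard deviation of the estimator is about $\sqrt{0.45 \times 0.55 / 1000}/0.5 \approx 0.031$, the same order of magnitude as the value reported for this estimator in Table 1.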



3.1.2 Comparison when A* = 1 

We consider now the general (more difficult) case when (A) is only satisfied for $\lambda^* = 1$. Thus, $f_1$ is a beta density of parameter $s$: $f_1(t) = s(1 - t)^{s-1}$, $t \in [0, 1]$, with $s \in \{5, 10, 25, 50\}$. The sample size is $m = 1000$ and $\pi_0 \in \{0.5, 0.7, 0.9, 0.95\}$. Each condition has been repeated $n = 500$ times. We detail below four of the different methods that have been compared in this framework.


Smoother and Bootstrap 

In [26], the authors proposed a method consisting in first computing the Schweder and Spjøtvoll estimator on a regular grid of $[0, 1]$ and then fitting a cubic spline. The final estimator of $\pi_0$ is the resulting function evaluated at 1. This procedure is called Smoother.

The Bootstrap method was introduced in [25]. The authors define the optimal value of $\lambda$ as the minimizer of the MSE of their $\pi_0$ estimator. Since this quantity is unknown, they use an estimation based on bootstrap. They also need to compute $\hat\pi_0(\lambda)$ for values of $\lambda$ on a preliminary grid of $[0, 1]$.

These methods are available as options of the qvalue function in the R package qvalue [26].

Adaptive Benjamini-Hochberg procedure

In the sequel, this procedure is denoted by ABH and we refer to [J] for a detailed description. In outline, the method relies on the idea that the plot of p-values versus their ranks should be (nearly) linear for large enough p-values (likely $H_0$ p-values). The inverse of the resulting slope provides a plausible estimator based on assumption (A).
The ABH procedure may be applied through the function pval.estimate.eta0 in package fdrtool with the option method="adaptive" (http://cran.r-project.org/src/contrib/Descriptions/fdrtool.html).



Twilight 

In their article, Scheid et al. [19] proposed a penalized criterion based on assumption (A'). This is the sum
of the Kolmogorov-Smirnov score and a penalty term. The whole criterion is expected to provide the
widest possible set of $H_0$ hypotheses. How the penalty term balances against the Kolmogorov-Smirnov
score depends on a constant C that has to be determined. To do so, the authors propose to use bootstrap
combined with Wilcoxon tests. Besides, this procedure is iterative and strongly depends on the size of
the data, which could be a serious drawback as data sets grow.
The function twilight is available in the package twilight [20].

Results 

As in the preceding simulation study, LPO and LOO refer to the proposed methods. Figure 2 illustrates
the performance of all the methods but ABH, whose results are quite poor compared with the other
methods (see Table 2). We notice that both St_Sm and St_Boot have systematically larger MSE than the
three remaining approaches. Our two methods give quite similar results in this framework.
Twilight, LPO and LOO yield nearly the same MSE values in the most difficult case $s = 5$, when
$\pi_0 > 0.5$. Except for $\pi_0 = 0.5$ and $s = 5$, LPO and LOO outperform Twilight, all the more so as the
proportion rises. The better performance of Twilight in this set-up ($\pi_0 = 0.5$, $s = 5$) may be due to the classical
difference between cross-validation and penalized criteria. Indeed, in the context of supervised classification for
instance, Kearns et al. 1^ and Bartlett et al. [2] show that cross-validation usually provides good
results, provided the noise level of the signal is not too high. Otherwise, penalized criteria (like Twilight)
outperform cross-validation. In the present context, $s = 5$ means that $H_1$ p-values are spread over a
large part of [0, 1] and not only concentrated in a neighbourhood of 0, while $\pi_0 = 0.5$ indicates a larger
number of $H_1$ p-values in the tail of the Beta density. Thus this situation may be regarded
as the counterpart of the noisy case in supervised classification. Nevertheless, LPO and LOO always
outperform Twilight when $\pi_0 > 0.5$. They are even uniformly better than Twilight for $\pi_0 = 0.95$, that
is, for small proportions of $H_1$ hypotheses.

3.2 Comparison in the U-shape case 

The 'U-shape case' refers to the phenomenon underlined by Pounds et al. [17] on a real data set made of
Affymetrix 'pooled' present-absent p-values (one p-value per probe set). We explore the behaviour of the
preceding methods applied to p-values with similar distributions. In our simulation design, the sample size is
$m = 1000$, while $\pi_0 \in \{0.25, 0.5, 0.7, 0.8, 0.9\}$ and $n = 200$ repetitions of each condition have been made.
Typically, the U-shape case appears when one-sided tests are made whereas the non-tested alternative
is true. For example, suppose the test statistics are distributed as a three-component Gaussian mixture
model

$$\pi_0\, \mathcal{N}(0,\ 2.5 \cdot 10^{-2}) + \frac{1-\pi_0}{2} \left[ \mathcal{N}(a, \theta^2) + \mathcal{N}(b, \nu^2) \right], \qquad (12)$$

where $a < 0$, $b > 0$ and $\theta, \nu > 0$, corresponding respectively to non-induced, under-expressed and
over-expressed genes. We want to test whether genes are over-expressed, that is $H_0$: 'the mean equals 0'
versus $H_1$: 'the mean is positive'. A test statistic drawn from $\mathcal{N}(a, \theta^2)$ (under-expressed gene) is more
likely to have a larger p-value than those drawn from $\mathcal{N}(b, \nu^2)$, which actually correspond to over-expressed
genes. This phenomenon is all the more pronounced as the gap between $a$ and $b$ is large and the variances $\theta^2$
and $\nu^2$ are small. Note that a similar shape may be observed when test statistics are ill-chosen.

Figure 2: Graphs of the MSE of the $\pi_0$ estimator versus $\log s$, where $s$ is the parameter of the Beta
density. Each graph is devoted to a given proportion, from 0.5 to 0.95. St_Sm denotes the MSE obtained
for Smoother, St_Boot that of Bootstrap, while Twil stands for Twilight.
In order to mimic Pounds' example, we use model (12) with $-a = b \in \{1, 1.5\}$ and $\theta = \nu \in \{0.5, 0.75\}$. As
they were quite similar, the results in these different conditions are gathered in Table 3. Except for LPO and
LOO, for which this phenomenon is not so strong, every other method overestimates $\pi_0$, all the more so as
the proportion of p-values under the uniform distribution is small. In our framework, a decrease in $\pi_0$
entails an increase in the right part of the histogram near 1, which is responsible for the overestimation
(violation of assumption (A)). On the contrary, when $\pi_0 = 0.9$, the violation of assumption (A) is
weaker and similar values of MSE are obtained for the competing approaches. In this set-up, LPO,
LOO and St_Boot systematically provide the lowest MSE values. In comparison, it is somewhat surprising
that Twilight overestimates $\pi_0$ so much, since it should have remained reliable under assumption (A').
Despite the preceding simulation results, we observe a repeated overestimation, which means that the
criterion under-penalizes large sets of p-values. The penalty involved may have been designed for the
previous situation (with only one peak near 0), whereas it may no longer be relevant in this framework.
This may be interpreted as a consequence of the higher adaptivity of cross-validation based methods over
penalized criteria. Finally, it is worth noticing that both the bias and the MSE of LPO are systematically
lower than those of LOO, showing the interest of choosing p in an adaptive way.
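The U-shape design can be reproduced approximately as follows. This is a sketch under stated assumptions: the second argument of each Gaussian in (12) is read as a variance, the one-sided p-value is taken as the upper tail probability under the null component, and the function name and seed handling are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_u_shape(m, pi0, b, theta, rng):
    """One-sided p-values for statistics drawn from the three-component
    Gaussian mixture, with -a = b and theta = nu."""
    sd0 = math.sqrt(2.5e-2)  # null variance read as 2.5e-2 (assumption)
    null = NormalDist(0.0, sd0)
    pvalues = []
    for _ in range(m):
        u = rng.random()
        if u < pi0:
            x = rng.gauss(0.0, sd0)   # non-induced gene (H0)
        elif u < pi0 + (1.0 - pi0) / 2.0:
            x = rng.gauss(-b, theta)  # under-expressed gene
        else:
            x = rng.gauss(b, theta)   # over-expressed gene
        # H0: mean = 0 against H1: mean > 0, so p = P(N(0, sd0^2) >= x).
        pvalues.append(1.0 - null.cdf(x))
    return pvalues
```

Under-expressed genes produce p-values close to 1 and over-expressed genes produce p-values close to 0, which yields the two peaks of the U-shaped histogram.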
Table 2: Numerical results for different $\pi_0$ estimators with $s = 10$ and $\pi_0 \in \{0.5, 0.7, 0.9, 0.95\}$. Four
other methods are compared to LPO and LOO. St_Sm denotes Smoother, St_Boot stands for Bootstrap
and Twil for Twilight. (All displayed quantities are multiplied by 100.)

                   π0 = 0.5                      π0 = 0.7
Method     Bias    Std           MSE    Bias    Std           MSE
LPO         1.4    3.5    14.5·10^-2     1.4    3.4    13.6·10^-2
LOO         1.6    3.4    13.9·10^-2     1.6    3.3    13.4·10^-2
St_Sm      -0.9    5.1    26.2·10^-2    -0.9    6.0    36.2·10^-2
St_Boot    -2.3    4.0    20.9·10^-2    -3.3    4.7    33.3·10^-2
Twil       -1.0    3.6    14.0·10^-2    -1.5    4.2    19.4·10^-2
ABH        37.9    8.3          15.0    0.27    2.4           7.6

                   π0 = 0.9                      π0 = 0.95
Method     Bias    Std           MSE    Bias    Std           MSE
LPO         0.8    3.6    13.7·10^-2     0.5    3.1     9.5·10^-2
LOO         1.0    3.4    12.5·10^-2     0.7    2.9     8.9·10^-2
St_Sm      -0.5    6.6    43.1·10^-2    -1.0    5.5    30.8·10^-2
St_Boot    -3.7    5.4    43.4·10^-2    -3.7    5.1    39.6·10^-2
Twil       -1.6    4.4    21.8·10^-2    -1.6    4.2    20.2·10^-2
ABH         9.8    0.4    95.5·10^-2     4.9    0.1    24.1·10^-2

Table 3: Results of the U-shape case for the six compared methods for $\pi_0 \in \{0.25, 0.5, 0.7, 0.8, 0.9\}$. (All
displayed quantities are multiplied by 100.)

                 π0 = 0.25               π0 = 0.5                π0 = 0.7
Method      Bias    Std    MSE     Bias    Std    MSE     Bias    Std    MSE
LPO          5.5    6.2    0.7      5.5    5.2    0.6      5.3    4.4    0.5
LOO          6.2    5.7    0.7      6.8    5.7    0.8      6.6    4.8    0.7
St_Sm       75.0          56.0     50.0          25.0     30.0           9.0
St_Boot     43.2    3.2   18.7     28.9    2.2    8.4     17.4    1.6    3.0
Twil        73.2    2.5   53.6     47.5    3.0   22.6     27.4    2.3    8.0
ABH         45.5    5.4   21.0     31.4    4.2   10.0     19.8    3.1    4.0

                 π0 = 0.8                π0 = 0.9
Method      Bias    Std    MSE     Bias    Std    MSE
LPO          5.3    4.1    0.4      4.2    2.7    0.2
LOO          6.4    4.1    0.6      4.7    2.5    0.3
St_Sm       20.0           4.0      9.9    0.2    1.0
St_Boot     11.6    1.3    1.0      5.4    1.6    0.3
Twil        17.5    1.8    3.0      8.0    1.3    0.7
ABH         13.8    2.3    2.0      7.4    1.3    0.6
Table 4: Values of the empirical estimate of the FDR (%) for the LPO (FDR_LPO), LOO (FDR_LOO),
Twilight (FDR_Twil), Benjamini-Hochberg (FDR_BH) and Oracle (FDR_Best) procedures. s denotes the
parameter of the Beta distribution used to generate the data.

s      π0   FDR_LPO   FDR_LOO  FDR_Twil    FDR_BH  FDR_Best
5     0.5     14.15     14.06     14.85      8.35     14.29
      0.7     14.13     14.03     14.85     10.40     14.50
      0.9     15.01     15.01     15.73     14.26     14.81
     0.95     13.23     13.43     13.76     13.13     13.83
10    0.5     14.74     14.69     15.50      6.94     15.02
      0.7     15.14     15.09     15.61     10.29     15.12
      0.9     17.91     17.90     18.08     15.85     17.94
     0.95     14.65     14.65     15.25     14.37     14.95
25    0.5     14.88     14.82     15.51      7.48     15.04
      0.7     14.69     14.64     15.19     10.47     14.84
      0.9     15.50     15.57     16.31     13.56     15.92
     0.95     14.35     14.22     14.51     13.19     14.19
50    0.5     14.76     14.71     15.42      7.40     14.89
      0.7     14.81     14.77     15.23     10.36     14.87
      0.9     13.93     13.82     14.79     13.17     13.98
     0.95     16.12     16.32     16.57     14.65     16.08

3.3 Power 

Here, we study the influence of the estimation of $\pi_0$ on the power of multiple testing procedures obtained
as described in Section 3.1.2 for various $\pi_0$ estimators. The Twilight method is used for comparison,
in association with the Benjamini-Hochberg procedure ([Sj). Our reference is what we call the Oracle
procedure, which consists in plugging the true value of $\pi_0$ into the MTP of Section 3.1.2. The
same simulations as in Section 3.1.2 are used for this study, which is carried out in two steps. In the
first one, we compare procedures in terms of their empirical FDR, in order to assess the expected
control for finite samples. Thus, we choose the level $\alpha = 0.15$ at which we want to control the FDR
and then compute, for each of the $n = 500$ samples, the corresponding FDP in the terminology of
[11], i.e. the ratio of the number of falsely rejected hypotheses over the total number of rejections.
Finally, we get an estimator $\widehat{\mathrm{FDR}}$ of the actual FDR by averaging the simulation results. Table 4 gives
results for the LPO and LOO based procedures (FDR_LPO, FDR_LOO) and also for the Twilight (FDR_Twil),
Benjamini-Hochberg (FDR_BH) and Oracle (FDR_Best) procedures. In the second step, we check the
potential improvement in power enabled by the LPO-based MTP with respect to the BH-procedure. The
assessment of this point is made in terms of the expectation of the proportion of falsely non-rejected
hypotheses among true alternatives (named FNR here). This criterion is estimated by the average of
the preceding ratio computed from each sample. Table 5 displays the empirical FNR values, denoted
by FNR_LPO, FNR_LOO, FNR_Twil, FNR_BH and FNR_Best respectively for the LPO, LOO, Twilight,
Benjamini-Hochberg and Oracle procedures. In both steps of this study, s denotes the parameter of the
Beta distribution that was used to simulate the data.
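The two criteria of this study can be sketched in code as follows. The sketch assumes that the plug-in MTP amounts to running the Benjamini-Hochberg step-up procedure at level $\alpha / \hat{\pi}_0$, which is how the threshold $\sup\{t : \theta t / G(t) \leq \alpha\}$ used in the Appendix reads when $G$ is the empirical cdf of the p-values. All function names are hypothetical.

```python
def plugin_bh(pvalues, alpha, pi0_hat=1.0):
    """Step-up BH procedure run at level alpha / pi0_hat.

    Returns a list of booleans, True where the hypothesis is rejected.
    pi0_hat = 1 gives the usual BH-procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # number of rejections: largest rank meeting the condition
    for rank, i in enumerate(order, start=1):
        if pi0_hat * m * pvalues[i] <= alpha * rank:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def fdp(rejected, is_null):
    """Falsely rejected hypotheses over total rejections (0 if none)."""
    n_rej = sum(rejected)
    false_rej = sum(r and h0 for r, h0 in zip(rejected, is_null))
    return false_rej / n_rej if n_rej else 0.0


def fnr(rejected, is_null):
    """Non-rejected true alternatives over all true alternatives."""
    n_alt = sum(not h0 for h0 in is_null)
    missed = sum((not r) and (not h0) for r, h0 in zip(rejected, is_null))
    return missed / n_alt if n_alt else 0.0
```

Averaging `fdp` and `fnr` over the simulated samples gives empirical FDR and FNR values of the kind reported in the tables. Passing the true $\pi_0$ as `pi0_hat` corresponds to the Oracle procedure.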



In comparison to the Oracle procedure (with the true $\pi_0$), Table 4 shows that the LPO procedure
provides an actual value of the FDR that is almost always very close to the best possible one. Moreover, in
nearly all conditions, LPO outperforms its LOO counterpart and remains slightly conservative, i.e.
it yields an FDR that is lower than or equal to the desired level $\alpha$. This observation empirically confirms
the result stated in Theorem 2.2. Besides, as expected, the estimation of $\pi_0$ entails a tighter control than
that of the BH-procedure, where $\pi_0 = 1$. Unlike the proposed methods, Twilight fails to control the
FDR at the desired level since FDR_Twil is very often larger than FDR_Best (the best reachable value),
and even larger than $\alpha$. Consequently, Twilight should not enter the comparison of methods in terms
of power.

Table 5: Average proportion of falsely non-rejected hypotheses (%) for the LPO (FNR_LPO), LOO
(FNR_LOO), Twilight (FNR_Twil), Benjamini-Hochberg (FNR_BH) and Oracle (FNR_Best) procedures.
s denotes the parameter of the Beta distribution used to generate the data.

s      π0   FNR_LPO   FNR_LOO  FNR_Twil    FNR_BH  FNR_Best
5     0.5     93.94     94.22     91.64     99.78     94.16
      0.7     99.65     99.65     99.59     99.80     99.63
      0.9     99.87     99.87     99.86     99.89     99.86
     0.95     99.91     99.91     99.90     99.92     99.91
10    0.5     25.69     25.91     22.01     96.83     23.22
      0.7     96.36     96.44     95.08     99.16     96.03
      0.9     99.56     99.56     99.54     99.64     99.56
     0.95     99.76     99.76     99.76     99.77     99.74
25    0.5      0.88      0.90      0.70     17.72      0.79
      0.7     22.83     23.04     20.85     61.00     21.93
      0.9     97.89     97.89     97.68     98.49     97.86
     0.95     99.16     99.16     99.06     99.23     99.14
50    0.5      0.96      0.92      0.64      1.58      0.72
      0.7      2.26      2.30      2.01     10.07      2.19
      0.9     82.40     82.47     80.39     88.05     82.08
     0.95     96.74     96.76     96.60     97.15     96.74

Table 5 highlights that the proportions of false negatives may be very high in most of the simulation
conditions, as shown by the Oracle procedure. Nevertheless, FNR_LPO remains very close to the ideal one. As
a remark, note that the Twilight FNR estimates are also close to the Oracle values, but nearly always
lower. As suggested by the FDR results, LOO is less powerful than LPO, whereas both of them outperform
the BH-procedure by far. Note that the proportion of false negatives strongly decreases when s grows,
which means that $H_1$ p-values are more and more concentrated in the neighbourhood of 0. As the interval
on which assumption (A) is satisfied is wider, the problem becomes easier. Besides, we observe a fall in
power when $\pi_0$ grows in general. Indeed, for a small proportion of true alternatives, the "border" between
the two populations of p-values is more difficult to define, as a large number of $H_1$ p-values behave like
$H_0$ ones. Finally, note that very often, the LPO procedure shares (nearly) the same power as the Oracle
one.



3.4 Discussion 

In this article, we propose a new estimator of the unknown proportion of true null hypotheses $\pi_0$. It
relies first on the estimation of the common density of p-values by means of non-regular histograms of a
special type, and secondly on leave-p-out cross-validation. The resulting estimator offers more
flexibility than numerous existing ones, since at least it remains suitable in the "U-shape" case, without
any additional computational cost.

Our estimator may be linked with that of Schweder and Spjøtvoll, for which almost only theoretical
results with $\lambda$ fixed have been obtained by Storey. However, unlike the latter, we provide a fully adaptive
procedure that does not depend on any user-specified parameter. Thus, asymptotic optimality results
are here derived with $\lambda = \hat{\lambda}$. They assert, for instance, that the asymptotic exact control of the FDR
with our plug-in MTP is reached.

Finally, a wide range of simulations shows that the proposed $\pi_0$ estimator achieves the best bias-variance
tradeoff among all tested estimators. Moreover, the proposed plug-in procedure is (empirically)
shown to provide the expected control of the FDR (for finite samples), while being a little more powerful
than its LOO counterpart. The results in Section 3 also confirm the interest in choosing the parameter p
adaptively rather than using the usual value p = 1. The LPO procedure is very often almost as powerful
as the best possible one of this type, obtained when $\pi_0$ is known.

4 Appendix 



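Proof. (Lemma [

First, we show that $T(\alpha, \cdot, G)$ is right (resp. left) continuous on $[0,1)$ (resp. $(0,1]$). As the reasoning is
similar, we only deal with right continuity.

Let $(\epsilon_n)_n \in (\mathbb{R}_+)^{\mathbb{N}}$ denote a sequence decreasing towards 0. For any $\theta \in (0, 1]$, set, for all $n$,
$r_n = T(\alpha, \theta + \epsilon_n, G)$ a.s. Then $(r_n)_n$ is an almost surely convergent increasing sequence, upper bounded by
$T(\alpha, \theta, G)$. To prove that $T(\alpha, \theta, G)$ is its limit, we show that for any $\delta > 0$, there exists $\epsilon > 0$ satisfying
$T(\alpha, \theta + \epsilon, G) \geq T(\alpha, \theta, G) - \delta$. Notice that there exists $\eta > 0$ s.t.
$T := T(\alpha, \theta, G) = \sup\{t \in [\eta, 1] : Q_\theta(t) \leq \alpha\}$. Then for $0 < \delta < \eta$,

$$T - \delta = \sup\left\{ u \in [\eta - \delta, 1 - \delta] : \frac{\theta (u + \delta)}{G(u + \delta)} \leq \alpha \right\}.$$

Provided $\delta$ is small enough, $G(u + \delta) = G(u)$, $\forall u$. Hence,

$$T - \delta = \sup\left\{ u \in [\eta - \delta, 1 - \delta] : \frac{\theta u}{G(u)} + \frac{\theta \delta}{G(u)} \leq \alpha \right\},$$

and

$$T(\alpha, \theta + \epsilon, G) = \sup\left\{ t \in [0, 1] : \frac{\theta t}{G(t)} + \frac{\epsilon t}{G(t)} \leq \alpha \right\}.$$

Thus, any $0 < \epsilon \leq \delta\theta$ provides the result.

For the second point, define $G \in \mathcal{B}_+([0, 1])$ and, for any sequence $(\epsilon_n)_n \in (\mathbb{R}_+)^{\mathbb{N}}$ decreasing towards 0, let
$(H_n)_n \in (\mathcal{B}_+([0, 1]))^{\mathbb{N}}$ denote a sequence of positive bounded functions satisfying, for all $n$, $\|G - H_n\|_\infty \leq \epsilon_n$.
Then for large enough $n$, we have

$$\frac{\theta t}{G(t) - \epsilon_n} \leq \alpha \iff \frac{\theta t}{G(t)} \leq \alpha \left( 1 - \frac{\epsilon_n}{G(t)} \right)$$

and $\alpha (1 - \epsilon_n / \|G\|_\infty) < \alpha$. Thus, $r_n = \sup\{t : \theta t / (G(t) + \epsilon_n) \leq \alpha\}$ denotes an increasing sequence
that is bounded by $T(\alpha, \theta, G)$. Moreover, as $(\epsilon_n)_n$ decreases towards 0, $r_n$ is as close as we want to
$T(\alpha, \theta, G)$. The same reasoning may be followed with $r'_n = \sup\{t : \theta t / (G(t) - \epsilon_n) \leq \alpha\}$, which concludes
the proof. $\square$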